On the Semiparametric Efficiency of the Scott-Wild Estimator under Choice-Based and Two-Phase Sampling
Abstract
Suppose that for each of a number of subjects we measure a response y and a vector of covariates x, in order to estimate the parameters β of a regression model that describes the conditional distribution of y given x. If we have sampled directly from the conditional distribution, or even from the joint distribution, we can estimate β without knowledge of the distribution of the covariates. In the case of a discrete response, which takes one of J values y_1, ..., y_J, say, we often estimate β using a case-control sample, in which we sample from the conditional distribution of x given y = y_j. This is particularly advantageous if some of the values y_j occur with low probability. In case-control sampling, the likelihood involves the distribution of the covariates, which may be quite complex, and direct parametric modelling of this distribution may be too difficult. To get around this problem, the covariate distribution can be treated non-parametrically. In a series of papers (Scott and Wild 1986, 1997, 2001; Wild 1991), Scott and Wild developed an estimation technique that yields a semi-parametric estimate of β. They dealt with the unknown distribution of the covariates by profiling it out of the likelihood, and derived a set of estimating equations whose solution is the semi-parametric estimator of β. This technique also works well for more general sampling schemes, for example two-phase outcome-dependent stratified sampling. Here, the sample space is partitioned into S disjoint strata that are defined completely by the values of the response and possibly some of the covariates. In the first phase of sampling, a prospective sample of size N is taken from the joint distribution of x and y, but only the stratum to which each individual belongs is observed. In the second phase, for s = 1, ..., S, a sample of size n_1^(s) is selected from the n_0^(s) individuals in stratum s who were selected in the first phase, and the rest of the covariates are measured.
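The two-phase scheme described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the abstract: a binary response generated from a logistic model with one standard normal covariate, strata defined by the response alone (so S = 2), and simple random sampling without replacement in the second phase. The function name `two_phase_sample` and the parameters `beta0`, `beta1`, and `seed` are hypothetical. It shows the sampling design only, not the Scott-Wild estimator.

```python
import math
import random


def two_phase_sample(N, n1, beta0=-3.0, beta1=1.0, seed=0):
    """Simulate two-phase outcome-dependent stratified sampling.

    Assumed (hypothetical) setup: binary y, scalar x ~ N(0, 1),
    logistic model for P(y = 1 | x), strata defined by y alone.
    n1 maps each stratum s to the requested second-phase size n_1^(s).
    """
    rng = random.Random(seed)

    # Phase 1: prospective sample of size N from the joint distribution of (x, y).
    phase1 = []
    for _ in range(N):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        y = 1 if rng.random() < p else 0
        phase1.append((x, y))

    # At this stage only stratum membership is treated as observed.
    strata = {0: [], 1: []}
    for i, (_, y) in enumerate(phase1):
        strata[y].append(i)
    n0 = {s: len(members) for s, members in strata.items()}  # n_0^(s)

    # Phase 2: within each stratum, draw n_1^(s) of the n_0^(s) individuals
    # without replacement and "measure" their covariates.
    phase2 = {}
    for s, members in strata.items():
        size = min(n1[s], n0[s])  # cannot draw more than the stratum holds
        chosen = rng.sample(members, size)
        phase2[s] = [phase1[i] for i in chosen]

    return n0, phase2


if __name__ == "__main__":
    n0, phase2 = two_phase_sample(N=2000, n1={0: 50, 1: 50})
    print("first-phase stratum counts:", n0)
    print("second-phase sizes:", {s: len(rows) for s, rows in phase2.items()})
```

With a rare outcome, most first-phase subjects fall in the y = 0 stratum, so drawing equal second-phase sizes from each stratum over-represents the rare stratum relative to the population.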
Such a sampling scheme can reduce the cost of studies by confining the measurement of expensive variables to the
Similar resources
Research Article On the Semiparametric Efficiency of the Scott-Wild Estimator under Choice-Based and Two-Phase Sampling
On the Breslow-Holubkov estimator.
Breslow and Holubkov (J Roy Stat Soc B 59:447-461 1997a) developed semiparametric maximum likelihood estimation for two-phase studies with a case-control first phase under a logistic regression model and noted that, apart from the overall intercept term, it was the same as the semiparametric estimator for two-phase studies with a prospective first phase developed in Scott and Wild (Biometrika 84...
Generalized Ridge Regression Estimator in Semiparametric Regression Models
In the context of ridge regression, the estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Much effort has gone into developing methods for computing shrinkage estimators for different fully parametric ridge regression approaches, using eigenvalues. However, the estimation of the shrinkage parameter has been neglected for semiparametric regression models. The m...
Kernel Ridge Estimator for the Partially Linear Model under Right-Censored Data
Objective: This paper aims to introduce a modified kernel-type ridge estimator for partially linear models under randomly right-censored data. Such models involve two main issues that need to be solved: multicollinearity and censoring. To address these issues, we improved the kernel estimator based on synthetic data transformation and kNN imputation techniques. The key idea of this paper is t...
Estimating Multiple Treatment Effects Using Two-phase Regression Estimators
We propose a semiparametric two-phase regression estimator with a semiparametric generalized propensity score estimator for estimating average treatment effects in the presence of informative first-phase sampling. The proposed estimator can be easily extended to any number of treatments and does not rely on a prespecified form of the response or outcome functions. The proposed estimator is show...
Journal title:
- JAMDS
Volume: 2007
Publication date: 2007